chore(ci): refine Claude Code Action workflows by Sardor-M · Pull Request #4 · mqttRust/coremq_rust

Sardor-M · 2026-05-08T06:53:39Z

Summary

Replace the default-boilerplate @claude handler and the plugin-based auto-review with a tailored pair adapted from a production-tested setup. Same two files, materially different behavior.

`claude.yml` (the `@claude` mention handler)

Concurrency group keyed on issue/PR number. Rapid @claude mentions on the same PR no longer spawn parallel agent runs.
Write permissions on pull-requests and issues so the agent can post comments and edit PR descriptions (was: read-only).
timeout-minutes: 30 to bound runaway jobs.
Model pinned to claude-sonnet-4-6.
Tool allowlist covers the toolchain agents will actually need: cargo check/build/test/clippy/fmt, npm run *, npx prettier/eslint, yarn *, plus gh pr/api, git diff/log/show, and Read/Edit/Write/Glob/Grep.

`claude-code-review.yml` (auto PR review)

Replace the marketplace code-review plugin with an inline reviewer:

4000-line cost guard. Diffs above the threshold are skipped with a comment pointing reviewers at @claude for targeted asks. Keeps spend predictable (~$0.10–0.20 per run on Sonnet 4.6).
Lockfile/asset filtering. pnpm-lock, package-lock, yarn.lock, Cargo.lock, images, and source maps are stripped from the diff before review.
File context for changed code. Both .ts/.tsx/.js/.jsx and .rs files appear as CHANGED (up to 600 lines each, with truncation markers so the reviewer doesn't false-positive on missing identifiers). For TS/JS files, imports are resolved transitively to add RELATED files (200 lines each). Rust mod resolution is non-trivial, so .rs files appear without related-file walking.
CoreMQ-tailored system prompt. Calls out the project-specific bug classes worth flagging: AdminCommand wiring (oneshot reply requirement), packet_id panic class, blocking syscalls in async, channel deadlocks, hardcoded colors, and the en.json / ko.json / uz.json i18n triplet rule. Tells the reviewer not to invent issues from truncated context.
Cost reporting. Each review comment includes diff/context line counts, token usage breakdown, and estimated USD cost.
Concurrency cancels in-progress reviews when a new push lands on the same PR.

What it looks like in practice

A PR review comment will look like:

## 🤖 Claude Code Review

### Summary
<2-3 sentences>

### Issues
- [major] server/coremq-server/src/engine/engine.rs:142 — Match arm drops oneshot reply on Err path; controller will hang.
  Fix: Send the error response back via `let _ = reply.send(Err(...));` before returning.

---
*Reviewed by claude-sonnet-4-6 · diff: 234 lines · context: 812 lines · tokens: 4,219 in / 287 out · cost: $0.0167 · tag `@claude` to ask follow-up questions*

Test plan

Confirm ANTHROPIC_API_KEY is present in repo Actions secrets (already required for the existing workflows — same secret name).
Open this PR and verify claude-code-review.yml posts a review comment within ~1 minute of the push.
Tag @claude in a comment on this PR and verify claude.yml responds.
(optional) Push a 4000+ line diff somewhere and verify the cost guard fires with the skip-notice comment.

Notes

This PR replaces, doesn't add. Same two filenames, same secret. No new infrastructure.
The workflow scope is required to land changes under .github/workflows/ — already on this token.
Companion PR (skills + agent docs) lands separately.

@claude

Replace the default-boilerplate @claude handler and the plugin-based auto-review with a tailored pair: claude.yml - Add concurrency group keyed on issue/PR number so rapid @claude mentions don't spawn parallel agent runs. - Bump permissions to write on pull-requests/issues so the agent can post comments and edit PR descriptions. - Set timeout-minutes: 30. - Pin model to claude-sonnet-4-6 and pass an allowlist covering the cargo / clippy / fmt / test toolchain plus npm / yarn / eslint / prettier / gh, alongside Read/Edit/Write/Glob/Grep. claude-code-review.yml - Replace the marketplace plugin invocation with an inline diff + file-context build + curl-to-Anthropic flow that returns a single review comment with token usage and cost line. - 4000-line cost guard skips oversized diffs with an explanatory comment, keeping spend predictable. - Filter out lockfiles (pnpm/npm/yarn/Cargo.lock) and binary assets from the diff. - Include both .ts/.tsx/.js/.jsx and .rs in the changed-file context. Walk imports for TS/JS only; Rust files appear as CHANGED without related-file resolution (Rust mod resolution is non-trivial). - Tailored system prompt covers CoreMQ specifics: AdminCommand wiring, oneshot reply requirement, packet_id panic class, blocking-in-async, channel deadlocks, theme tokens, and the en/ko/uz i18n triplet rule. - Concurrency cancels in-progress reviews when a new push lands.

github-actions · 2026-05-08T06:54:14Z

🤖 Claude Code Review

Summary

Two CI workflow files replacing boilerplate with a production-tuned pair. The logic is mostly sound but has a few real issues worth fixing.

Issues

Major — claude-code-review.yml:68 — Lockfile/asset filtering only skips the single diff --git header line matching those patterns, leaving the rest of the hunk (the actual removed/added lines) in $DIFF. The grep -v approach needs to strip entire file sections, not just the diff --git line.
Fix: use filterdiff --exclude= (from patchutils) or post-process with awk to drop full hunks for matched paths.

Major — claude-code-review.yml:204 — max_tokens: 1024 is very tight for an 800-word review budget; a complex diff will hit the limit and return a truncated review with stop_reason: max_tokens, which then gets posted silently as a partial comment with no user-visible warning.
Fix: raise to 4096 and add an explicit check on stop_reason == "max_tokens" to append a truncation notice.

Minor — claude-code-review.yml:79 — Empty-diff check (if [ -z "$RAW_DIFF" ]) comes after the DIFF=$(... head -n 8000) assignment, so $DIFF is already computed; order is fine, but $DIFF_LINES will be 0 on empty input and the cost guard would trigger first (0 < 4000 = false… wait, 0 is not > 4000), so the empty check is actually dead if RAW_DIFF is empty because DIFF_LINES=0 passes the guard. The real problem: an empty diff skips review without error but the Post review comment step still runs because skip is set to true and steps.diff.outputs.skip != 'true' guards it — this is fine. Actually the empty check sets skip=true correctly; the only issue is the check should precede the wc -l on filtered diff to avoid the misleading "0 filtered lines" log noise. Minor ordering nit.

Minor — claude.yml:concurrency — group uses || in a shell expression sense but this is a GitHub Actions expression: ${{ github.event.issue.number || github.event.pull_request.number }}. If both are absent (shouldn't happen given the if guard, but e.g. on pull_request_review events github.event.issue is null), the group collapses to claude-agent- (empty suffix), serializing unrelated PRs.
Fix: use ${{ github.event.issue.number || github.event.pull_request.number || github.run_id }}.

Nits

ANTHROPIC_API_KEY is exposed as an env var on the review step and interpolated directly into jq/curl — fine since it only goes to the curl header, but storing the key in /tmp/payload.json (which is written to disk) alongside the diff content is worth noting for self-hosted runners.
The /* */ comment convention check in the system prompt references Rust block comments, but claude.yml uses # line comments per YAML convention — no issue, just confirming it's intentional.

Reviewed by claude-sonnet-4-6 · diff: 384 lines · context: 0 lines · tokens: 6819 in / 809 out · cost: $0.032592 · tag to ask follow-up questions

Sardor-M · 2026-05-08T07:09:57Z

🤖 Claude Code Review

Summary

Two CI workflow files replacing boilerplate with a production-tuned pair. The logic is mostly sound but has a few real issues worth fixing.

Issues

Major — claude-code-review.yml:68 — Lockfile/asset filtering only skips the single diff --git header line matching those patterns, leaving the rest of the hunk (the actual removed/added lines) in $DIFF. The grep -v approach needs to strip entire file sections, not just the diff --git line. Fix: use filterdiff --exclude= (from patchutils) or post-process with awk to drop full hunks for matched paths.

Major — claude-code-review.yml:204 — max_tokens: 1024 is very tight for an 800-word review budget; a complex diff will hit the limit and return a truncated review with stop_reason: max_tokens, which then gets posted silently as a partial comment with no user-visible warning. Fix: raise to 4096 and add an explicit check on stop_reason == "max_tokens" to append a truncation notice.

Minor — claude-code-review.yml:79 — Empty-diff check (if [ -z "$RAW_DIFF" ]) comes after the DIFF=$(... head -n 8000) assignment, so $DIFF is already computed; order is fine, but $DIFF_LINES will be 0 on empty input and the cost guard would trigger first (0 < 4000 = false… wait, 0 is not > 4000), so the empty check is actually dead if RAW_DIFF is empty because DIFF_LINES=0 passes the guard. The real problem: an empty diff skips review without error but the Post review comment step still runs because skip is set to true and steps.diff.outputs.skip != 'true' guards it — this is fine. Actually the empty check sets skip=true correctly; the only issue is the check should precede the wc -l on filtered diff to avoid the misleading "0 filtered lines" log noise. Minor ordering nit.

Minor — claude.yml:concurrency — group uses || in a shell expression sense but this is a GitHub Actions expression: ${{ github.event.issue.number || github.event.pull_request.number }}. If both are absent (shouldn't happen given the if guard, but e.g. on pull_request_review events github.event.issue is null), the group collapses to claude-agent- (empty suffix), serializing unrelated PRs. Fix: use ${{ github.event.issue.number || github.event.pull_request.number || github.run_id }}.

Nits

ANTHROPIC_API_KEY is exposed as an env var on the review step and interpolated directly into jq/curl — fine since it only goes to the curl header, but storing the key in /tmp/payload.json (which is written to disk) alongside the diff content is worth noting for self-hosted runners.

The /* */ comment convention check in the system prompt references Rust block comments, but claude.yml uses # line comments per YAML convention — no issue, just confirming it's intentional.

Reviewed by claude-sonnet-4-6 · diff: 384 lines · context: 0 lines · tokens: 6819 in / 809 out · cost: $0.032592 · tag to ask follow-up questions

@claude review this

claude · 2026-05-08T07:10:18Z

Claude finished @Sardor-M's task in 3m 10s —— View job

Claude Code Review

Read both changed workflow files
Verify the prior bot review's claims
Check for additional issues
Post review

🤖 Claude Code Review

Summary

Solid direction overall — the inline reviewer is more cost-controlled and CoreMQ-tailored than the marketplace plugin it replaces. The earlier bot review on this thread caught the two material bugs; I'll confirm them concretely and add a few it missed.

Confirming the prior bot's findings

Major — claude-code-review.yml:43 — Lockfile/asset filter is broken. CONFIRMED.
The regex ^diff --git.*(pnpm-lock\.yaml|...) only matches the single header line. A pnpm-lock.yaml change in a unified diff produces diff --git, index, ---, +++, @@, and the actual +/- lines — grep -v strips only the first. The remaining hunk lines (which dominate, since lockfiles are huge) stay in $DIFF and consume both context and tokens. The user-facing impact isn't just review-quality noise — a 3500-line PR that's mostly pnpm-lock.yaml churn passes the 4000-line cost guard and burns the budget reviewing irrelevant content.

Awk drops the patchutils dependency:

DIFF=$(echo "$RAW_DIFF" | awk '
  /^diff --git/ { skip = ($0 ~ /(pnpm-lock\.yaml|package-lock\.json|yarn\.lock|Cargo\.lock|\.(png|jpg|svg|ico|map))( |$)/) }
  !skip
' | head -n 8000)

Major — claude-code-review.yml:181 — max_tokens: 1024 is under-provisioned. CONFIRMED.
The system prompt explicitly budgets "under 800 words for large diffs." 800 English words ≈ 1100–1300 output tokens, so a thorough review on a non-trivial PR will hit the cap and stop mid-sentence. There's no stop_reason check at line 197 — it's only logged, not surfaced. Raise to 4096 and gate on stop_reason == "max_tokens":

STOP_REASON=$(echo "$RESPONSE" | jq -r '.stop_reason // ""')
if [ "$STOP_REASON" = "max_tokens" ]; then
  REVIEW="${REVIEW}

> ⚠️ Review truncated at max_tokens — re-tag \`@claude\` for the rest."
fi

Minor — claude.yml:15 — concurrency group can collapse. CONFIRMED.
For a pull_request_review event triggered without an enclosing PR comment, github.event.issue.number is null and github.event.pull_request.number resolves correctly — but if neither is present (defensive case), the group becomes claude-agent- and serializes unrelated events globally. Add a fallback: claude-agent-${{ github.event.issue.number || github.event.pull_request.number || github.run_id }}.

Minor — claude-code-review.yml:47 — empty-diff check ordering.
Agreed it's a cosmetic ordering nit only. Functionally correct.

Additional issues the prior review missed

Minor — claude-code-review.yml:174-189 — no prompt caching, despite cost-counting code that supports it.
Lines 211, 215–216 read and price cache_creation_input_tokens / cache_read_input_tokens, but the request payload never sets cache_control. Those values will always be 0. The system prompt is ~1.5k tokens and identical every run — adding a cache_control: { type: "ephemeral" } breakpoint on the system block cuts the input cost ~90% on the second-and-onward run within a 5-minute window. Worth doing since the cost-reporting infrastructure is already there:

system: [
  { type: "text", text: "Concise code reviewer for the CoreMQ project — ...", cache_control: { type: "ephemeral" } }
]

Minor — claude-code-review.yml:136-138 — TS import regex matches string literals.
s/.*from [\"']([^\"']+)[\"'].*/\1/p matches any line containing from "..." — including console.log("imported from './foo'") or comments. False positives are mostly harmless (resolution fails silently) but inflate context size on files with chatty logging. Anchoring to a leading-keyword pattern helps: s/^[[:space:]]*(import|export)[[:space:]].*from[[:space:]]+["']([^"']+)["'].*/\2/p.

Minor — claude-code-review.yml:108 — src/* alias hardcoded to client/.
Fine for this repo's layout, but if the CoreMQ tree ever grows a second TS package (e.g. a shared lib), src/Foo will silently resolve to client/src/Foo and miss real targets. Worth a comment at minimum so the assumption is discoverable.

Nit — claude-code-review.yml:43 — regex matches .png mid-path.
\.png|\.jpg|... will match diff --git a/docs/screenshots.png.txt b/... (pathological but possible). Anchor with a word boundary or end-of-token: \.(png|jpg|svg|ico|map)( |$). The awk replacement above already does this.

Nit — claude.yml:48 — allowlist patterns lack a separator before *.
Bash(cargo check*) matches cargo check, cargo checkout, cargo checksum-anything. The intent is clearly cargo check followed by args, so Bash(cargo check *) is more precise. Same for cargo build*, cargo test*, etc. Likely benign in practice — there's no cargo checkout — but worth tightening.

On the prior bot's nits

The /tmp/payload.json concern is real but lower-stakes than they implied — ANTHROPIC_API_KEY itself is never written to the file (it's only in the curl header). What's on disk is the diff + system prompt + key-less payload. On GitHub-hosted runners the runner is ephemeral; on self-hosted it's worth knowing the diff hits /tmp before the API call.
The Rust /* */ vs YAML # comment thing is not an issue — the system prompt's convention applies to files under review, not to the workflow YAML itself.

Fix the lockfile filter → · Raise max_tokens and handle truncation → · Add prompt caching →

· branch chore/refine-claude-workflows

Sardor-M · 2026-05-08T07:24:29Z

Summary

Replace the default-boilerplate @claude handler and the plugin-based auto-review with a tailored pair adapted from a production-tested setup. Same two files, materially different behavior.

claude.yml (the @claude mention handler)

Concurrency group keyed on issue/PR number. Rapid @claude mentions on the same PR no longer spawn parallel agent runs.

Write permissions on pull-requests and issues so the agent can post comments and edit PR descriptions (was: read-only).

timeout-minutes: 30 to bound runaway jobs.

Model pinned to claude-sonnet-4-6.

Tool allowlist covers the toolchain agents will actually need: cargo check/build/test/clippy/fmt, npm run *, npx prettier/eslint, yarn *, plus gh pr/api, git diff/log/show, and Read/Edit/Write/Glob/Grep.

claude-code-review.yml (auto PR review)

Replace the marketplace code-review plugin with an inline reviewer:

4000-line cost guard. Diffs above the threshold are skipped with a comment pointing reviewers at @claude for targeted asks. Keeps spend predictable (~$0.10–0.20 per run on Sonnet 4.6).

Lockfile/asset filtering. pnpm-lock, package-lock, yarn.lock, Cargo.lock, images, and source maps are stripped from the diff before review.

File context for changed code. Both .ts/.tsx/.js/.jsx and .rs files appear as CHANGED (up to 600 lines each, with truncation markers so the reviewer doesn't false-positive on missing identifiers). For TS/JS files, imports are resolved transitively to add RELATED files (200 lines each). Rust mod resolution is non-trivial, so .rs files appear without related-file walking.

CoreMQ-tailored system prompt. Calls out the project-specific bug classes worth flagging: AdminCommand wiring (oneshot reply requirement), packet_id panic class, blocking syscalls in async, channel deadlocks, hardcoded colors, and the en.json / ko.json / uz.json i18n triplet rule. Tells the reviewer not to invent issues from truncated context.

Cost reporting. Each review comment includes diff/context line counts, token usage breakdown, and estimated USD cost.

Concurrency cancels in-progress reviews when a new push lands on the same PR.

What it looks like in practice

A PR review comment will look like:
## 🤖 Claude Code Review

### Summary
<2-3 sentences>

### Issues
- [major] server/coremq-server/src/engine/engine.rs:142 — Match arm drops oneshot reply on Err path; controller will hang.
  Fix: Send the error response back via `let _ = reply.send(Err(...));` before returning.

---
*Reviewed by claude-sonnet-4-6 · diff: 234 lines · context: 812 lines · tokens: 4,219 in / 287 out · cost: $0.0167 · tag `@claude` to ask follow-up questions*
Test plan

Confirm ANTHROPIC_API_KEY is present in repo Actions secrets (already required for the existing workflows — same secret name).

Open this PR and verify claude-code-review.yml posts a review comment within ~1 minute of the push.

Tag @claude in a comment on this PR and verify claude.yml responds.

(optional) Push a 4000+ line diff somewhere and verify the cost guard fires with the skip-notice comment.

Notes

This PR replaces, doesn't add. Same two filenames, same secret. No new infrastructure.

The workflow scope is required to land changes under .github/workflows/ — already on this token.

Companion PR (skills + agent docs) lands separately.

@otabek05 Can you confirm these changes ?

Sardor-M mentioned this pull request May 8, 2026

docs: add AGENTS, CONTRIBUTING, CHANGELOG, and skill playbooks #5

Merged

5 tasks

Sardor-M requested a review from otabek05 May 8, 2026 07:23

Sardor-M self-assigned this May 8, 2026

Sardor-M added documentation Improvements or additions to documentation enhancement New feature or request labels May 8, 2026

Sardor-M mentioned this pull request May 8, 2026

docs: rename project to Plexus and rewrite README #6

Merged

4 tasks

Sardor-M merged commit 7328d45 into main May 11, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(ci): refine Claude Code Action workflows#4

chore(ci): refine Claude Code Action workflows#4
Sardor-M merged 1 commit into
mainfrom
chore/refine-claude-workflows

Sardor-M commented May 8, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 8, 2026

Uh oh!

Sardor-M commented May 8, 2026

🤖 Claude Code Review

Summary

Issues

Nits

Uh oh!

claude Bot commented May 8, 2026 •

edited

Loading

Uh oh!

Sardor-M commented May 8, 2026

Summary

`claude.yml` (the `@claude` mention handler)

`claude-code-review.yml` (auto PR review)

What it looks like in practice

Test plan

Notes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Sardor-M commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

claude.yml (the @claude mention handler)

claude-code-review.yml (auto PR review)

What it looks like in practice

Test plan

Notes

Uh oh!

github-actions Bot commented May 8, 2026

🤖 Claude Code Review

Summary

Issues

Nits

Uh oh!

Sardor-M commented May 8, 2026

🤖 Claude Code Review

Summary

Issues

Nits

Uh oh!

claude Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Claude Code Review

🤖 Claude Code Review

Summary

Confirming the prior bot's findings

Additional issues the prior review missed

On the prior bot's nits

Uh oh!

Sardor-M commented May 8, 2026

Summary

claude.yml (the @claude mention handler)

claude-code-review.yml (auto PR review)

What it looks like in practice

Test plan

Notes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Sardor-M commented May 8, 2026 •

edited

Loading

`claude.yml` (the `@claude` mention handler)

`claude-code-review.yml` (auto PR review)

claude Bot commented May 8, 2026 •

edited

Loading

`claude.yml` (the `@claude` mention handler)

`claude-code-review.yml` (auto PR review)